A new database containing crystallographic
and chemical information designed
especially for application to
electron diffraction search/match and 
related problems has been developed. 
The new database was derived from 
two well-established x-ray diffraction 
databases, the JCPDS Powder Diffrac- 
tion File and NBS CRYSTAL DATA, 
and incorporates 2 years of experience 
with an earlier version. It contains 
71,142 entries, with space group and 
unit cell data for 59,612 of those. Unit 
cell and space group information were 
used, where available, to calculate patterns
consisting of all allowed reflections
with d-spacings greater than 0.8 Å
for ~59,000 of the entries. Calculated
patterns are used in the database in pref- 
erence to experimental x-ray data when 
both are available, since experimental x-ray
data sometimes omits high d-spacing
data which falls at low diffraction angles.
Intensity data are not given when
calculated spacings are used. A search
scheme using chemistry and r-spacing
(reciprocal d-spacing) has been developed.
Other potentially searchable data
in this new database include space 
group, Pearson symbol, unit cell edge 
lengths, reduced cell edge length, and 
reduced cell volume. Compound and/or 
mineral names, formulas, and journal 
references are included in the output, as 
well as pointers to corresponding entries 
in NBS CRYSTAL DATA and the 
Powder Diffraction File where more 
complete information may be obtained. 
Atom positions are not given. Rudimentary
search software has been written to
implement a chemistry and r-spacing bit
map search. With typical data, a full
search through ~71,000 compounds
takes 10-20 seconds on a PDP 11/23-
RL02 system.
Introduction



The identification of crystalline objects in the
size range from 10 μm to 10 Å is readily accomplished
in the analytical electron microscope
(AEM) if the analyst has access to appropriate in- 
formation. Most often the needed information ex- 
sists, but either it is not readily accessible in the 
laboratory or it is not in the most useful form. Ac- 
quiring and reprocessing reference data is often the 
time-limiting step in the identification process. In- 
formation scattered through the open literature has 
been collected into compilations which recently
have become available in computer-readable form
[1,2]. Even so, the format of the data is not ideally 
suited for electron diffraction work [3]. 

We perceived a need for a specialized database 
to support efficient phase identification by com- 
bined electron diffraction and energy dispersive x- 
ray spectroscopy (EDS) in a modern analytical 
electron microscope. Considering the quality of the 
experimental data obtainable from the AEM, the 
quantity of reference data, and available computing 
machinery, we set out to create a database to support
search/match procedures [4] and crystallographic
calculations [5] performed routinely in our
laboratories. 



Description of the Database 

This database was derived from two copyrighted 
databases, NBS CRYSTAL DATA and the PDF- 
2. The preparation of the derivative database was 
facilitated by the fact that the original databases are 
in the same format since both were built with a 
program called NBS*AIDS83 [6]. The new deriva- 
tive database contains a subset of information from 
the full databases, selected on the basis of perti- 
nence to electron diffraction analysis. Only inor- 
ganic compounds were used [7]. The data is 
accurate and as complete as possible, but has been 
reduced in precision to a level appropriate for electron
diffraction work (±~1% at 1.5 Å). It has
been packed in a manner which allows it to be used 
on a small computer equipped with a 10 Mb hard 
disk. The database is complete so that it is useful 
without reference to other sources such as cards [1] 
or books [1,2], but it contains pointers so that if a 
card file [1], CDROM [8], or other full listing [1,2] 
is available, one can quickly get to that information 
as well. 

The data were selected from the two sources as 
follows: 

1. All inorganic compounds from NBS 
CRYSTAL DATA were used. The unit 
cell and space group information from each 
compound were used to compute up to 60 
non-redundant allowed reflections with d-spacings
greater than 0.8 Å. Intensities
were not computed. There are 59,612 en- 
tries of this type. 

2. Inorganic compounds from PDF-2 sets 1- 
33 whose entries do not give unit cell data, 
and all entries from sets 34-36 were used. 
These are only a subset of the full PDF-2 
database. It was assumed that entries hav- 
ing unit cell information in sets 1-33 are ad- 
equately represented by similar entries in 
NBS CRYSTAL DATA and would only
duplicate information. d-spacings and intensities
(obtained from x-ray methods)
were used. All inorganic compounds from
PDF-2 sets 34-36 were used whether or not 
they contained unit cell information, since 
it could not be assessed whether such com- 
pounds had been included in NBS CRYS- 
TAL DATA yet. (A little duplication is 
better than missing a compound altogether.)
This group contains 11,530 entries.

Despite their different origins, the two types of
source data are functionally equivalent and are 
treated equally in the new database. They are min- 
gled in the ordered and indexed Search file. The 
computed data (1.) represents the best target group 
for matching on the basis of observed d-spacings
from single diffraction patterns. The data in group 
(2.) is similar to the data obtainable from the PDF 
Level I database, an earlier version of the PDF-2 
used in this work. We have searched against type 
(2.) data for over two years with fair success [3]. 
When searching failed, it was often because the ex- 
perimental x-ray observations in the PDF Level I 
database did not include high d-spacing reflections
observable by electron diffraction. The computed 
data in (1.) is an attempt to correct this weakness, 
but computation is not possible for compounds in 
(2.) because unit cell and space group information 
is absent. The data in (2.) is valuable nonetheless, 
because even if you cannot completely character- 
ize such a material, at least you can determine that 
"you found it again." The literature reference may 
be of some use in such cases. 

As in the earlier version of this work, data are 
stored in two types of files: Reference files and a 
Search file. We have kept sufficient information in 
each entry to be of use for electron diffraction anal- 
ysis, but have put only certain critical information 
in the Search file, for the sake of speed. The data 
for each compound, therefore, is divided between 
the Search file and a Reference file. There may be 
more than one entry for a given compound. Multi- 
ple entries for the same compound are present 
mainly when derived from different literature cita- 
tions. 



The contents of a Reference file entry for a given compound are: 

1. Name length (1 byte): Number of bytes (x) to store the
    compound name.

2. Formula length (1 byte): Number of bytes (y) to store the
    compound formula.

3. # of intensities (1 byte): Number of reflections (z) having
    intensities (if computed, then 0).

4. Unit cell angles (4 bytes): Number and kind of angles given
    for the conventional unit cell.

5. Reduced cell angles (4 bytes): Number and kind of angles
    given for the reduced cell [4].

6. Pearson symbol (4 bytes): xXnnnn, indicates crystal class,
    symmetry, and number of atoms in the conventional unit cell.

7. Journal reference (17 bytes): CODEN, volume, page, and first
    9 characters of the author name field (Radix-50), and
    year (-1800).

8. Source ID (3 bytes): PDF number or CRYSTAL DATA ID number.

9. Unit cell angles (0-12 bytes): Degrees × 100, only the
    necessary ones.

10. Reduced cell angles (0-12 bytes): Degrees × 100, only the
    necessary ones.

11. Compound name (x bytes): Including mineral name, if
    applicable (Radix-50).

12. Compound formula (y bytes): Functional formula (ASCII).

13. Intensities (>z/2 bytes): In nibbles, if present (a nibble =
    4 bits = 1/2 a byte), always ending on a word boundary.

The first eight items are fixed length fields; the last
five vary in length and may be absent. The entries
in the Reference files therefore vary in length. Angles
are multiplied by 100 and rounded to convert
them to integers, which take less storage than floating
point numbers while preserving two decimal
place precision. Only angles not equal to 90 degrees
are stored, with a code indicating whether
they represent α, β, or γ. Missing angles are always
90°. The compound names are converted to Radix-50
notation which encodes 3 characters per 2 bytes
(50% denser than packed character strings). Reference
entries are grouped together in 16 Reference
files, each of which contains a large number
(2000-5000) of entries. Twelve Reference files contain data
from NBS CRYSTAL DATA entries, and four
files contain data from PDF-2 compounds. There is
a pointer to the corresponding Reference file entry
stored in each Search file entry. The Reference
files are not meant to be searched, but rather to be
directly accessed one time, after a search has been
completed. In total, these files require ~4.5 Mb of
disk.
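
The two storage conventions described for the Reference files, angles held as integers (degrees × 100) and names packed in Radix-50, can be illustrated with a minimal Python sketch. It assumes the standard DEC PDP-11 Radix-50 character set (40 symbols, three per 16-bit word); the function names are hypothetical, and the database's own packing routines may differ in detail.

```python
# Assumed DEC Radix-50 character set: 40 symbols, so three characters fit in
# one 16-bit word (40**3 = 64000 < 65536). Position 29 is unassigned in the
# PDP-11 convention; "%" is used here only as a placeholder.
RAD50 = " ABCDEFGHIJKLMNOPQRSTUVWXYZ$.%0123456789"


def rad50_pack(text):
    """Pack text into a list of 16-bit words, three characters per word."""
    text = text.upper()
    text += " " * (-len(text) % 3)  # pad with blanks to a multiple of three
    words = []
    for i in range(0, len(text), 3):
        # RAD50.index raises ValueError for characters outside the set.
        a, b, c = (RAD50.index(ch) for ch in text[i:i + 3])
        words.append((a * 40 + b) * 40 + c)
    return words


def rad50_unpack(words):
    """Recover the text from a list of Radix-50 words (trailing blanks dropped)."""
    chars = []
    for w in words:
        chars += [RAD50[w // 1600], RAD50[(w // 40) % 40], RAD50[w % 40]]
    return "".join(chars).rstrip()


def encode_angle(degrees):
    """Store an angle as an integer number of hundredths of a degree."""
    return int(round(degrees * 100))


def decode_angle(stored):
    """Convert a stored integer back to degrees."""
    return stored / 100.0
```

A six-letter name such as QUARTZ packs into two words (4 bytes) instead of 6 bytes of ASCII, which is the saving the text refers to.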



The contents of a Search file entry for a given compound are:

1. Chemistry bit map (12 bytes): Elements 1-96 in six 16-bit
    words.

2. r-spacing bit map (22 bytes): Eleven 16-bit words,
    representing 176 cells, each 0.018 cm wide, of r-spacing
    (r = λL/d, where λL = 2.5 Å-cm). At λL = 2.5 Å-cm,
    r-spacings range from 0.0 to 3.2 cm, representing
    d-spacings from ∞ to 0.8 Å.

3. Space group (2 bytes): Encoded in two bytes, allows for
    *nnnX. The *, if present, signifies that the space group
    is not completely determined, so an aspect is given [6].
    nnn is the space group number (1-230). X, if present,
    gives the setting, e.g., Pbca or Pcab, etc.

4. Unit cell edges (6 bytes): Å × 100, three 16-bit integers,
    the dimensions of the unit cell given in NBS CRYSTAL
    DATA, which may be different from the unit cell assigned
    by the original author.

5. Reduced cell edges (6 bytes): Å × 100, three 16-bit
    integers, the dimensions of a mathematically unique
    primitive unit cell equivalent to the conventional unit
    cell.

6. Reduced cell volume (2 bytes): Used by the NBS Lattice
    search program.

7. Flags (8 bits): Organic/inorganic, mineral, metals &
    alloys, hydrate, deleted, NHx-containing, unit cell
    differs from original author's cell, and a spare.

8. Pointer (3 bytes): To the corresponding Reference file
    entry.

9. Spare word (2 bytes): In case something simple needs to be
    added in the future.

This is a single large file (~4 Mb). The entries are
ordered on the basis of composition, beginning
with atomic number 11 (sodium). We have assumed
that an EDS detector with a beryllium window
is being used. This is the most common type of
detector in the field today. It is capable of detecting
only elements whose characteristic x rays are
hard enough to penetrate the Be window (namely
Z ≥ 11). This orders the file on the basis of EDS-detectable
qualitative chemistry, scattering oxides,
carbides, etc., through the file associated with their
EDS-observable elements. This ordering is advantageous
even when using an EDS detector that can
detect lighter elements, because the light elements
are so common in compounds in the file as to be a
disadvantage when searching. For example, oxygen
is present in more than half of all the compounds
in the file, so it is much more efficient to go
looking for iron-bearing compounds (5909) that
contain oxygen (3837), than oxygen-bearing compounds
(40084) that contain iron (3837). The ordering
scheme also places compounds containing only
undetectable light elements (e.g., ice, graphite,
boron nitride) at the end of the file, where they
may be skipped as a group if so desired. Each entry
in this file is a fixed length (56 bytes). Entries are
grouped into records. There are 18 entries in each
record, followed by 16 empty bytes to pad the
record length to 1024 bytes (two blocks). This facilitates
a speedy search by creating a constant offset
or spacing between fields of the same type
within a record, and allows for easy disk access
with a two-block buffer.

The first part of the Search file contains an index
to the records in the remainder of the Search file.
The indexing scheme was described in detail previously [3].
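
The r-spacing bit map of a Search file entry can be illustrated with a minimal Python sketch. It takes the camera constant λL = 2.5 Å-cm and the 176 cells from the entry description, and reads the cell width as 0.018 cm (an assumption, chosen because 176 such cells span the stated 0 to 3.2 cm range). The bit ordering, the function names, and the matching rule (every observed cell must also be set in the reference map) are assumptions made for illustration; the actual algorithm is the one described in [3].

```python
CAMERA_CONSTANT = 2.5   # lambda*L in angstrom-cm, as stated in the text
CELL_WIDTH = 0.018      # cm per cell (assumed unit, see lead-in)
N_CELLS = 176           # eleven 16-bit words


def r_cell(d_spacing):
    """Return the bit-map cell for a d-spacing in angstroms, or None if out of range."""
    r = CAMERA_CONSTANT / d_spacing       # r-spacing in cm
    cell = int(r / CELL_WIDTH)
    return cell if 0 <= cell < N_CELLS else None


def r_bitmap(d_spacings):
    """Build the 176-bit map as a Python integer, one bit per occupied cell."""
    bits = 0
    for d in d_spacings:
        cell = r_cell(d)
        if cell is not None:
            bits |= 1 << cell
    return bits


def to_words(bits):
    """Split the bit map into eleven 16-bit words (assumed low cells first)."""
    return [(bits >> (16 * i)) & 0xFFFF for i in range(11)]


def matches(observed_bits, reference_bits):
    """True if every observed cell is also set in the reference map."""
    return (observed_bits & ~reference_bits) == 0
```

At d = 1.5 Å the r-spacing is about 1.67 cm, so one 0.018 cm cell corresponds to roughly 1% of the spacing, in line with the precision quoted for the database. A practical search would probably also allow for measurement error, for example by accepting a hit in a neighboring cell.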

There is one index entry for each record in the 
Search file. The Index file is 60 Kb in size. There 
are 18 compounds per 1024-byte record in the 
Search file, so each entry in this index file refers to 
18 compounds. Because the Search file is ordered 
by chemistry, the Index file makes it possible to perform
a coarse screening (in groups of 18) of the
Search file to find the records which may contain 
compounds with the proper chemistry. More di- 
rectly, the index allows the search software to ap- 
ply a quick test and then most often skip over a 
group of compounds which certainly contain no 
possible matches based on the chemistry require- 
ments. This greatly reduces the number of Search 
file entries which must be processed in detail and 
can increase the overall speed of the search by as 
much as an order of magnitude. 
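
The coarse screening described above can be sketched in Python as follows. Chemistry bit maps are held as integers with one bit per element, and the index holds the OR of the maps in each record of 18 entries. The bit assignment (bit Z-1 for atomic number Z), the function names, and the use of an exclusion mask for unobserved elements are assumptions for illustration, not the distributed assembly or FORTRAN code.

```python
def chem_bitmap(atomic_numbers):
    """Return a 96-bit map with bit (Z - 1) set for each element present."""
    bits = 0
    for z in atomic_numbers:
        if not 1 <= z <= 96:
            raise ValueError("atomic number outside 1-96")
        bits |= 1 << (z - 1)
    return bits


def build_index(entries, per_record=18):
    """OR together the chemistry maps of each record (group of entries)."""
    index = []
    for start in range(0, len(entries), per_record):
        ormap = 0
        for chem in entries[start:start + per_record]:
            ormap |= chem
        index.append(ormap)
    return index


def search(entries, index, required, excluded=0, per_record=18):
    """Return positions of entries with all required and no excluded elements."""
    hits = []
    for rec, ormap in enumerate(index):
        # Coarse test: if a required element is missing from the OR of the
        # whole record, no entry in that record can match, so skip it.
        if required & ~ormap:
            continue
        start = rec * per_record
        for pos, chem in enumerate(entries[start:start + per_record], start):
            if (required & ~chem) == 0 and (chem & excluded) == 0:
                hits.append(pos)
    return hits
```

Only the required elements can be tested against the OR map; excluded elements must be checked entry by entry, since the OR map cannot show that an element is absent from any one compound. With no required elements, every record passes the coarse test and every entry is examined.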

The structure of the file is based on our search/ 
match experience with an earlier version of this 
database. It is designed to be searched first on the 
basis of chemistry, which has been shown to be the 
primary characteristic in electron diffraction phase 
identification work [9]. The index file allows one to 
skip over large sections of the file where no chem- 
istry matches are possible, greatly reducing search 
time. After considering chemistry, we can perform 
a secondary match on the basis of observed r-spac- 
ings, or on the basis of flags indicating membership 
in one or more subsets of the data. It is also possible 
to make no requirements on chemistry, in which 
case all entries in the file will pass the chemistry 
test. Then, a search takes the maximum amount of 
time since every entry will be tested for secondary 
match requirements. It is possible to search on the 
basis of other parameters, such as space group, 
Pearson Symbol, reduced cell parameters [4], or 
unit cell parameters, although we have not developed
software to do so. Since the unit cell parameters
are stored for most of the compounds, it is
possible to write additional software to quickly calculate
precise d-spacings and Miller indices of allowed
reflections for a particular compound if the
need arises.

The contents of an Index entry for a given compound are:

1. Bnum: Block number in the Search file (2 bytes).

2. ORmap: Six 16-bit words containing the result of performing
    the boolean OR function on the chemistry bit maps for all
    the compounds in one record.
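
The remark that d-spacings and Miller indices can be recalculated from the stored unit cell parameters can be illustrated with a minimal Python sketch using the general (triclinic) expression for 1/d². The function names are hypothetical, and the space group extinction rules are not implemented: the caller supplies them through the `allowed` argument, which by default accepts every reflection.

```python
import math
from itertools import product


def d_spacing(h, k, l, a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """d-spacing in angstroms for reflection (hkl); cell edges in angstroms, angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    # Squared cell volume.
    v2 = (a * b * c) ** 2 * (1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)
    inv_d2 = (h * h * b * b * c * c * (1 - ca * ca)
              + k * k * a * a * c * c * (1 - cb * cb)
              + l * l * a * a * b * b * (1 - cg * cg)
              + 2 * h * k * a * b * c * c * (ca * cb - cg)
              + 2 * k * l * a * a * b * c * (cb * cg - ca)
              + 2 * h * l * a * b * b * c * (cg * ca - cb)) / v2
    return 1.0 / math.sqrt(inv_d2)


def reflections(a, b, c, alpha=90.0, beta=90.0, gamma=90.0,
                d_min=0.8, allowed=lambda h, k, l: True):
    """Distinct d-spacings above d_min, largest first, rounded to 4 decimals."""
    # d(hkl) can never exceed a/|h| (likewise for k and l), which bounds the search.
    ranges = [range(-int(edge / d_min), int(edge / d_min) + 1) for edge in (a, b, c)]
    found = set()
    for h, k, l in product(*ranges):
        if (h, k, l) == (0, 0, 0) or not allowed(h, k, l):
            continue
        d = d_spacing(h, k, l, a, b, c, alpha, beta, gamma)
        if d > d_min:
            found.add(round(d, 4))
    return sorted(found, reverse=True)
```

For a face-centered cubic cell, for example, `allowed` could be `lambda h, k, l: len({h % 2, k % 2, l % 2}) == 1` (indices all even or all odd). Keeping only the first 60 values of the result would mimic the limit of 60 reflections per entry used in building the database.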



Search/Match Software 

Source code for basic functional search/match 
software is distributed with the database. Two ver- 
sions exist. An assembly language search algorithm 
was written and described for the first generation 
of this file [3]. The general nature of the algorithm 
remains the same for this file, with minor changes 
to accommodate the format of the new database. 
Experience with typical data (2 or 3 observed ele- 
ments, all unobserved heavy elements and some 
light elements excluded, 6-8 diffraction spots) has 
shown that most searches require 10-20 seconds to 
search the full file on a PDP 11/23 equipped with
an RL02 10 Mbyte hard disk; I/O takes several 
times longer than that. It is also possible to write 
search programs for this file in high level lan- 
guages. FORTRAN versions have implemented 
the same search on VAX, PDP, and SUN computers.
On the PDP, the FORTRAN version gives the
same results but runs five times slower than the 
assembly language version. Similar programs could 
be written in other languages that support bit ma- 
nipulation. A version of this software has been 
written in Flextran to be integrated into the RAD 
group of programs [5] which run on computerized 
EDS analysis equipment attached to an electron 
microscope. Users are encouraged to modify or 
add to the programs. Additional software for 
searching on the basis of reduced cell [4] or space 
group may be added at a later date. 



work well for traditional x-ray diffraction analysis 
where high precision data for both peak position 
and intensity are obtained and used. 

It is anticipated that many different search/ 
match schemes will be able to use this database, 
although we have initially implemented only one. 
Searching first on the basis of qualitative EDS 
chemistry is a natural consequence of the type of 
information obtained with the AEM and greatly 
increases searching speed in a small computer. The 
computed data, incorporating high d-spacing reflections,
are very diagnostic for electron diffraction
search/match identification. Beyond its
usefulness as a search/match tool, the database also 
provides a convenient resource for crystallo- 
graphic data for pattern simulation. The full inte- 
gration of this database into our existing analytical 
software is planned, and we expect that it will be 
useful to other laboratories as well. 

The development of this database has been a 
joint project at Sandia National Laboratories and 
the National Institute of Standards and Technol- 
ogy, with the encouragement of the JCPDS/ 
ICDD. Further evolution of this database and any 
related items will be guided by the members of the 
Phase Identification by Electron Diffraction sub- 
committee of the JCPDS Technical Committee 
and the NIST Crystal Data Center. The details of 
the database format, software to generate and up- 
date the database from original source tapes, and 
the search/match software are available upon re- 
quest. The database itself is copyrighted by the Na- 
tional Institute of Standards and Technology and is 
being distributed by license through the JCPDS/ 
ICDD. For information on obtaining the database 
contact JCPDS/ICDD headquarters [1] or the 
NIST Crystal Data Center [2]. 